Reversible transformations from pure to mixed states, and the unique measure of information

Transformations from pure to mixed states are usually associated with
information loss and irreversibility. Here, a protocol is demonstrated
allowing one to make these transformations reversible. The pure states are
diluted with a random noise source. Using this protocol one can study
optimal transformations between states, and from this derive the unique
measure of information. This is compared to irreversible transformations
where one does not have access to noise. The ideas presented here shed some
light on attempts to understand entanglement manipulations and the
inevitable irreversibility encountered there, where one finds that mixed
states can contain "bound entanglement".



I. INTRODUCTION 

There are two opposing pictures of information. In the
first picture, a source produces a large amount of information
if it has large entropy. Thus information can be
associated with entropy. This is because the receiver is
being informed only if she is "surprised". In such an approach
the information has a subjective meaning: something
which is known by the sender, but is not known by the
receiver. The receiver treats the message as information
only if she did not already know it.

One can consider a different approach to information - 
an objective one where a system represents information 
if it is in a pure state (zero entropy). The state is itself
the information. This view is more natural in the con- 
text of thermodynamics. There, "knowledge is power" 
in the sense that one can draw work from a single heat 
bath by use of systems in known pure states 0. On the 
other hand, the heat bath is represented by a maximally 
entropic state, hence it is the least informative one. The
pure state represents information needed to order the en- 
ergy of the heat bath. 

There can be many candidates for functions to measure
information. However, Shannon recognized that there is
a unique function that has the natural properties needed
to describe information. Shannon derived his unique
measure based on the subjective picture of information.
Therefore his information function (Shannon entropy)
increases as the dispersion of the probability distribution
increases. The same is true of the generalization of
Shannon's entropy to the quantum case, which is the von
Neumann entropy $S(\varrho) = -\mathrm{Tr}\,\varrho \log \varrho$.

One can consider a measure of objective information
that has the converse tendency: namely $I = \log d - H$,
where $\log d$ is the maximal entropy of the system (i.e. the
system has $d$ states). In the quantum case it would be
$I = \log d - S(\varrho)$. Such a function was naturally interpreted
as the information content of the state, as introduced by
Brillouin.

One can ask the question: can this function be derived
independently of the notion of entropy, so that it is not
just a subtraction of two known terms, but rather has an
autonomous meaning?

It turns out that there is such a possibility and it is of- 
fered by quantum information theory: in Ref. ^ we have 
derived the function $I$ as the unique one that does not
increase under some class of operations. The motivation 
came from considering information as a resource in dis- 
tributed systems^- The main aim of the present paper 
is to present the full rigorous version of that derivation. 
In the process, we give a protocol for reversible transfor- 
mations between states using a random source of noise. 
We also discuss these results in the context of the issue 
of reversibility and entanglement theory. 

It is quantum information theory (QIT) that provides 
us with a suitable perspective to attack the problem. In- 
deed, one of the central themes of QIT is the idea of op- 
timal transitions between states under a restricted class 
of operations. This originates from attempts to describe 
entanglement of quantum states. Although it was diffi- 
cult to say what exactly entanglement was, it was clear 
that it could not increase under the class of operations 
made up of local operations and classical communication 
(LOCC)ll H. These operations allow one to use any 
amount of separable states for free, but do not allow one 
to create entangled states. One can take the converse 
point of view: one starts with a given class of operations 
(LOCC operations), and treats the states that are not
free as containing a resource, which can be called entan- 
glement (cf. 1^). The basic questions of entanglement 
theory are: can a state $\varrho$ be transformed into $\sigma$ by LOCC?
What is the optimal rate of such a transition? 

In entanglement theory, this allowed one to define a 
number of measures of entanglement, since essentially, 
any function which does not increase under LOCC is a 
measure. However, thus far, no one has found a unique 
measure. The essential difficulty (as will become clearer) 
is that operations under LOCC are not reversible. How- 
ever, if one has a restricted class of operations for which 
transitions are reversible, then we will see that the rate 
of transitions gives one a unique measure. This is similar 
to pure bipartite state entanglement, where we have
reversibility and there is a unique measure of entanglement
(entropy of subsystem) P 

In the present work, we consider a restricted class of
operations we shall call Noisy Operations (NO) [2] and
use this to develop a unique measure for information. Es- 
sentially, we consider operations where one is allowed to 
use random noise as a free resource. Perhaps counter- 
intuitively, randomness allows one to make the transfor- 
mations reversible: the number of pure states and noise 
which is needed to form the state, is the same as the 
amount that can be obtained from the state. The usual
interpretation of mixed states is that their creation
involves irreversibly destroying information. Here we see
that if one has access to noise as a resource, then there 
is no irreversibility. 

This has interesting consequences concerning entangle- 
ment theory, since there, the irreversibility is often asso- 
ciated with the fact that one is dealing with mixed states. 
Here, we see that transitions into mixed states need not 
involve irreversibility. In fact, the axiomatic structure 
of the paradigm presented here involving mixed states 
is very similar to pure state entanglement manipulation. 
This shows that, a priori, mixed state entanglement
manipulation need not involve irreversibility, leaving open
the question of why entanglement manipulation involves 
inevitable irreversibilities. 

Other restricted classes of operations may lead one to 
find unique measures for other quantities. Here, we con- 
sider the optimal transitions between states by means of 
NO. In the asymptotic limit of many identical copies, we 
will obtain that there is only one function that does not 
increase under NO. We will establish that the optimal
ratio of conversion between a state $\varrho$ of an $N$ qubit system
and a state $\sigma$ of an $N'$ qubit system is equal to
$\frac{N - S(\varrho)}{N' - S(\sigma)}$.
The transitions are reversible, even though mixed states 
are involved. Finally we will consider operations with- 
out free noisy ancillas. Then the mixed states have to 
be created from pure states by partial trace, which intro- 
duces irreversibility. We discuss the implications of our 
results on understanding entanglement transformations, 
especially bound entanglement. 

The work is organized as follows. In Section II we
introduce the class of Noisy Operations. Then in Section
III we show how one can transform a given state into
another state, under NO, provided certain conditions are
met. In Section IV we go to the asymptotic regime, and show
that these transition rates are optimal. This will allow
us to find the unique measure of information in Section V.
In Section VI we discuss the case of transitions without
access to noise, and give the transition rates in this case.
We discuss this in terms of understanding the source of
irreversibility in transitions, and relate it to attempts to
understand entanglement in Section VII. We conclude
with some open questions in Section VIII.

II. NOISY OPERATIONS 

Perhaps the most important restricted class of operations
which has been considered in quantum information
theory is LOCC, which was introduced in the context
of understanding entanglement in shared systems.
One is then interested in questions such as how
many maximally entangled states a particular state can
be transformed into (i.e. the rate of distilling singlets).
However, analyzing LOCC operations proved rather difficult.
Therefore, to facilitate the investigation of entanglement,
a larger class of operations was analyzed - so
called PPT operations pll fllL which are superopera- 
tors which preserve the positivity of partial transpose. 

One can also consider other restricted classes of oper- 
ations, and consider various versions of the state trans- 
formation problem. On the extreme end, one allows all 
operations, and adding any ancilla. Then any state can 
be created for free, so that there are no resources to be
manipulated, and the theory becomes trivial. 

As is well known, any operation can be composed out of a
unitary operation, adding an ancilla in some state, and 
removing ancilla. Suppose that we want to make the the- 
ory nontrivial, while keeping all unitaries in our class of 
allowable operations. The only way is then to restrict the 
state of the free ancilla, or somehow restrict removing an- 
cillas. In the present work, we consider only restrictions 
to the free ancilla. While one could instead consider re- 
strictions on removing ancillas, we believe that this would 
give identical results [13].

Thus we will restrict to choosing states that can be 
added for free by means of ancillas. Remarkably, the 
choice of which ancillas to allow is forced on us. It turns 
out that the only choice that does not make the theory
trivial is that the free ancilla must be in the maximally
mixed state. Essentially, we will see in Section V A that
if one allows any other ancilla, then all transition rates 
become infinite. This fixes the class of operations we will 
call noisy operations (NO). The class NO is therefore 
very natural, as it is the only one which gives non-trivial 
transition rates. 

In entanglement theory, an entangled state of Schmidt
rank 2 represents the same resource whether it acts on a
Hilbert space $C^2 \otimes C^2$ or on a larger space $C^d \otimes C^d$. This
is because embedding a state into a larger Hilbert space
is equivalent to adding local ancillas in a pure state. In
our case, a state acting on a Hilbert space $C^2$ is not the
same resource as the one acting on $C^4$. This is because
adding an ancilla in a pure state is adding a new resource.



III. OPTIMAL TRANSITIONS UNDER NOISY 
OPERATIONS - SINGLE COPY CASE 

In this section we will present a protocol to transform
single copies of states into each other by diluting them
with noise. We will show that the transition from a single
copy of a state $\varrho$ to a single copy of a state $\sigma$ is possible
if and only if the latter is more mixed than the former. This is
provided the Hilbert space is the same for both states,
i.e. they occupy the same number of qubits. We will
also consider the transitions between systems with different
numbers of qubits. One then has to add maximally mixed
ancillas to one of the systems (or to both), so that the
numbers of qubits become equal. Then we can apply the
above criterion. The term "more mixed" 0| has the 
following meaning: for states $\varrho$ and $\varrho'$ on the Hilbert
space $\mathcal{H} = C^d$, we say that $\varrho$ is more mixed than $\varrho'$
($\varrho \succ \varrho'$) if their eigenvalues $\lambda_i$ and $\lambda'_i$, taken in
decreasing order, satisfy

$$\sum_{i=1}^{k} \lambda_i \le \sum_{i=1}^{k} \lambda'_i, \qquad k = 1, \ldots, d.$$

(In the same way, one can say that some probability distribution
is more mixed than another one.) If the state is more mixed, its
eigendistribution is more spread. The order introduced
by the relation "$\succ$" has a largest element - the maximally
mixed state. It is easy to see that it is more mixed than
any other state.
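
As an illustration only (it is not part of the original derivation), the "more mixed" relation can be checked numerically by comparing partial sums of the eigenvalues taken in decreasing order. The Python sketch below assumes that both spectra are supplied as lists of equal length; the function name `is_more_mixed` and the tolerance `tol` are hypothetical choices made for this example.

```python
import numpy as np

def is_more_mixed(lam, lam_prime, tol=1e-12):
    """Return True if the spectrum `lam` is more mixed than `lam_prime`.

    Both arguments are eigenvalue lists of the same length. The test is
    the partial-sum condition on eigenvalues sorted in decreasing order.
    """
    a = np.sort(np.asarray(lam, dtype=float))[::-1]
    b = np.sort(np.asarray(lam_prime, dtype=float))[::-1]
    if a.shape != b.shape:
        raise ValueError("spectra must have the same length")
    # Every partial sum of `lam` must not exceed that of `lam_prime`.
    return bool(np.all(np.cumsum(a) <= np.cumsum(b) + tol))

# The maximally mixed spectrum is more mixed than any other spectrum.
print(is_more_mixed([0.25, 0.25, 0.25, 0.25], [0.7, 0.1, 0.1, 0.1]))
```

Two spectra can also be incomparable, in which case the function returns False in both directions.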

Let us now prove the main result of this section. 

Proposition 1 For states $\varrho$ and $\sigma$ of d-level systems the
transition $\varrho \to \sigma$ by NO is possible if and only if $\varrho \prec \sigma$,
i.e. $\sigma$ is more mixed than $\varrho$.

Proof. "$\Rightarrow$" follows from the fact that $\sigma \succ \varrho$ iff
there exists a bistochastic map [16] that maps $\varrho$ into $\sigma$.
Since noisy operations (for equal input and output dimensions)
are bistochastic, $\varrho \to \sigma$ implies $\sigma \succ \varrho$.
To prove "$\Leftarrow$" we cannot use the result of [16], because
we do not know if the existing map can be taken to be
a noisy operation. Instead we will construct the map explicitly.
Let us then assume that $\sigma \succ \varrho$. First, we can
always rotate $\varrho$ unitarily, so that it commutes with $\sigma$.
Thus we can assume without loss of generality that the
states commute. We can now use the fact [17] that if a
probability distribution $\{q_i\}$ is more mixed than $\{p_i\}$,
then the former can be obtained from the latter via a
mixture of permutations, i.e.

$$q_i = \sum_j \alpha_j\, p_{\sigma_j(i)} \qquad (1)$$

where $\sum_j \alpha_j = 1$, while $\sigma_j$ are permutations of indices
of the probability distribution. Let then $p_i$ be the eigenvalues
of $\varrho$ and $q_i$ the eigenvalues of $\sigma$. We will consider the
state $\varrho \otimes \tau_N$ (where $\tau_N$ is an added maximally mixed
state of dimension $N$) and construct some permutation
of eigenvalues of the latter density matrix. After such a
permutation, and removing the ancilla, the state will
approach $\sigma$ for large $N$. For simplicity we will assume
that there are only two permutations $\sigma_1$ and $\sigma_2$, so that
$q_i = \alpha\, p_{\sigma_1(i)} + (1 - \alpha)\, p_{\sigma_2(i)}$.

The eigenvalues of the state $\varrho \otimes \tau_N$ consist of $d$ blocks,
each of dimension $N$:

$$\frac{1}{N}\bigl(\underbrace{p_1, \ldots, p_1}_{N}, \ldots, \underbrace{p_d, \ldots, p_d}_{N}\bigr) \qquad (2)$$

We will divide each block into two groups of entries: the first
$N_1$ entries and the remaining $N_2 = N - N_1$ entries. Now
we will apply the permutation $\sigma_1$ to the first entries of each
block. Similarly, we apply it to the second entries of each block,
and so on, within the first group. The second group is
subjected to the permutation $\sigma_2$ in a similar way. The resulting
density matrix is



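$$\frac{1}{N}\bigl(\underbrace{p_{\sigma_1(1)}, \ldots, p_{\sigma_1(1)}}_{N_1}, \underbrace{p_{\sigma_2(1)}, \ldots, p_{\sigma_2(1)}}_{N_2}, \ldots, \underbrace{p_{\sigma_1(d)}, \ldots, p_{\sigma_1(d)}}_{N_1}, \underbrace{p_{\sigma_2(d)}, \ldots, p_{\sigma_2(d)}}_{N_2}\bigr) \qquad (3)$$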



Now we trace out the ancilla. This means that we sum
all elements of each block, and instead of the block, take
the resulting number. The obtained eigendistribution is
given by

$$q'_i = \frac{N_1}{N}\, p_{\sigma_1(i)} + \frac{N_2}{N}\, p_{\sigma_2(i)}. \qquad (4)$$

Choosing large $N$ and suitable $N_1$, $N_2$, one can make the
ratios $N_1/N$ and $N_2/N$ approach $\alpha$ and $1 - \alpha$ with
arbitrarily high accuracy. This ends the proof of the proposition.
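
The construction used in the proof can be simulated directly on eigenvalue lists. The following Python sketch is an illustration under stated assumptions, not the authors' code: the function name `dilute`, the argument names, and the representation of a permutation as an index list (`perm[i]` is the index whose eigenvalue ends up in block `i`) are all hypothetical. It builds the eigenvalues of the system together with a maximally mixed ancilla of dimension `n`, applies one permutation to the first `n1` entries of every block and the other to the remaining entries, and then sums each block, which corresponds to tracing out the ancilla.

```python
import numpy as np

def dilute(p, perm1, perm2, n1, n):
    """Spectrum left on the system after the block permutation and partial trace.

    p      : eigenvalues of the input state
    perm1  : permutation applied within the first n1 entries of each block
    perm2  : permutation applied within the remaining n - n1 entries
    n      : dimension of the maximally mixed ancilla
    """
    p = np.asarray(p, dtype=float)
    perm1 = np.asarray(perm1, dtype=int)
    perm2 = np.asarray(perm2, dtype=int)
    # Row i is block i of the joint spectrum: n copies of p_i / n.
    blocks = np.tile(p[:, None] / n, (1, n))
    out = np.empty_like(blocks)
    # Permute whole columns: the k-th entry of block i is taken from block perm[i].
    out[:, :n1] = blocks[perm1, :n1]
    out[:, n1:] = blocks[perm2, n1:]
    # Tracing out the ancilla sums the entries of each block.
    return out.sum(axis=1)

# With n1 / n = 0.25 the result is 0.25 * p + 0.75 * (p permuted by perm2).
q = dilute([0.7, 0.2, 0.1], [0, 1, 2], [1, 2, 0], n1=250, n=1000)
print(q)
```

The weights of the two permutations are the rational numbers `n1 / n` and `1 - n1 / n`, so an irrational mixing weight is only reached in the limit of large `n`, in line with the closing remark of the proof.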



IV. OPTIMAL TRANSITIONS UNDER NOISY
OPERATIONS - ASYMPTOTIC REGIME

Here we will consider asymptotic transitions of type

$$\varrho^{\otimes n} \to \sigma^{\otimes m_n}. \qquad (5)$$

Usually it is not possible to obtain a perfect state $\sigma^{\otimes m_n}$
from $\varrho^{\otimes n}$, even if an arbitrarily large number of copies
can be used. It is however possible to obtain a state
$\sigma_n$ that will asymptotically converge to $\sigma^{\otimes m_n}$:

$$\varrho^{\otimes n} \to \sigma_n \approx \sigma^{\otimes m_n}. \qquad (6)$$

Thus we allow for inaccuracy, provided it vanishes in the
limit of large $n$. The fidelity can be measured by the
trace norm, i.e. one requires that

$$\|\sigma_n - \sigma^{\otimes m_n}\| \to 0 \quad \text{for} \quad n \to \infty. \qquad (7)$$

The rate of a given protocol for the asymptotic $\varrho \to \sigma$ transition
is given by the asymptotic ratio $\lim_n \frac{m_n}{n}$. The optimal
transition rate, denoted by $R(\varrho \to \sigma)$, is given by the supremum
over rates attainable by protocols that satisfy the
asymptotic accuracy condition (7).

A. Conversion from mixed to pure states



We will now consider the optimal rate for the transition to
the one qubit pure state $\pi$, i.e. $\varrho \to \pi$. We will show that
if $\varrho$ is a state of a $d$-level system then

$$R(\varrho \to \pi) = I(\varrho) \qquad (8)$$

where $I = N - S(\varrho)$, with $N = \log d$ being the number
of qubits occupied by the state $\varrho$. In other words, the
transformation from pure states to mixed states is reversible,
in the sense that the number of pure states which
is needed, or which can be obtained, is the same. The
proof could simply use Schumacher compression [18],
however with a different interpretation (similar to that
in 01). We will also show that conversely, the amount 
of copies in state $\varrho$ that can be obtained under NO per
input pure qubit is equal to $1/I$. The proofs will be
similar to the reasoning of Nielsen in [23], where he derived
asymptotic rates of pure state entanglement manipulations
from single copies based on majorization.

We will use law of large numbers [HIIJ, that implies 
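that there exists a subset of the eigenvalues $p_i$ of $\varrho^{\otimes n}$, called the
typical set TYP, with useful properties. More precisely,
given $\epsilon, \delta > 0$, there exist a large enough $n$ and a set
TYP of eigenvalues such that

$$\sum_{p_i \in \mathrm{TYP}} p_i > 1 - \epsilon \qquad (9)$$

$$2^{-n(S+\delta)} \le p_i \le 2^{-n(S-\delta)} \quad \text{for} \quad p_i \in \mathrm{TYP} \qquad (10)$$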

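These are thus the eigenvalues that carry almost the
whole weight and they are more or less uniform. One
can consider two states $\varrho_{typ}$ and $\varrho_{atyp}$, given by

$$\varrho_{typ} = \frac{1}{c} \sum_{p_i \in \mathrm{TYP}} p_i |i\rangle\langle i|, \qquad \varrho_{atyp} = \frac{1}{1-c} \sum_{p_i \notin \mathrm{TYP}} p_i |i\rangle\langle i| \qquad (11)$$

where $|i\rangle$ are eigenvectors corresponding to $p_i$, and
$c = \sum_{p_i \in \mathrm{TYP}} p_i$ is a normalization constant. Clearly $\varrho^{\otimes n}$ is
a mixture of those states

$$\varrho^{\otimes n} = c\, \varrho_{typ} + (1 - c)\, \varrho_{atyp}. \qquad (12)$$

Since $c > 1 - \epsilon$ one finds that $\varrho_{typ}$ is close to $\varrho^{\otimes n}$:

$$\|\varrho_{typ} - \varrho^{\otimes n}\| \le 2\epsilon \qquad (13)$$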

Thus it suffices to use $\varrho_{typ}$ instead of $\varrho^{\otimes n}$. Let us
first show that one can convert $\varrho_{typ}$ into approximately
$n(N - S)$ pure qubits. To this end, note that
the eigenvalues $p_i/c$ of $\varrho_{typ}$ satisfy:

$$\frac{1}{c}\, 2^{-n(S+\delta)} \le \frac{p_i}{c} \le \frac{1}{c}\, 2^{-n(S-\delta)} \qquad (14)$$

Thus $\varrho_{typ}$ is less mixed than the state $\varrho_{out}$ with eigenvalues

$$\bigl(\underbrace{\tfrac{1}{D}, \ldots, \tfrac{1}{D}}_{D}, 0, \ldots, 0\bigr) \qquad (15)$$

where $D$ is given by

$$D = \left\lceil \frac{1}{2^{-n(S+\delta)}} \right\rceil \qquad (16)$$

(the eigenvectors of $\varrho_{out}$ are irrelevant, as we can perform
any unitary transformation for free). Indeed, by the lower bound
in (14) the state $\varrho_{typ}$ has at most $D$ nonzero eigenvalues. Both of the states
act on a $d^n$ dimensional space, so that we can apply our
Proposition 1. Thus it is possible to go from $\varrho_{typ}$ to $\varrho_{out}$ via
noisy operations. If we choose $D$ to be larger than the value
given above, namely so that it is a power of 2, the transition
is still possible. The smallest such $D$ satisfies
$\log D = \lceil n(S + \delta) \rceil < n(S + \delta) + 1$. Then the state $\varrho_{out}$ represents
exactly the tensor product of $\log D$ qubits in the maximally
mixed state and $n \log d - \log D > n(\log d - S - \delta) - 1$ qubits
in pure states. Thus one can remove the mixed qubits,
and keep the obtained pure qubits. Call the obtained
state $\pi_{out}$. The rate of the transition is the number of
obtained pure qubits divided by $n$. For large $n$ this tends
to $\log d - S - \delta$. Since $\delta$ can be chosen arbitrarily small,
we obtain the optimal asymptotic rate equal to $\log d - S$.

One could think that we obtain the pure qubits exactly.
However, we used Proposition 1, where the transition is
not exact, though arbitrarily precise.

Yet we have not transformed $\varrho^{\otimes n}$, but $\varrho_{typ}$. We now
take, instead of $\varrho_{typ}$, the state $\varrho^{\otimes n}$ and apply the same
action which transformed $\varrho_{typ}$ into the required number
of pure qubits (call the action $\Lambda$). It is now easy to see
that $\Lambda(\varrho^{\otimes n})$ is close to the final state of pure qubits $\pi_{out}$.
Indeed, we have

$$\|\Lambda(\varrho^{\otimes n}) - \pi_{out}\| = \|\Lambda(\varrho^{\otimes n}) - \Lambda(\varrho_{typ})\| \le \|\varrho^{\otimes n} - \varrho_{typ}\| \le 2\epsilon \qquad (17)$$

where the second last inequality comes from the fact
that completely positive trace-preserving maps are contractions
on Hermitian operators in the trace norm, i.e.
$\|\Lambda(A)\| \le \|A\|$ for Hermitian $A$.

Now we should show that the converse is possible, i.e.
to create a state $\varrho^{\otimes n}$ it is sufficient to start with $\log d - S$
pure qubits per output copy of $\varrho$. The proof is
similar to the above. The only difference is that we now
use the other part of eq. (10). Namely, we note that $\varrho_{typ}$
is more mixed than the state with eigenvalues

$$\bigl(\underbrace{\tfrac{1}{D'}, \ldots, \tfrac{1}{D'}}_{D'}, 0, \ldots, 0\bigr) \qquad (18)$$

where $D'$ is given by

$$D' = \left\lfloor \frac{1}{\frac{1}{c}\, 2^{-n(S-\delta)}} \right\rfloor \qquad (19)$$

Again, due to Proposition 1, we can turn the latter state into
$\varrho_{typ}$. Changing $D'$ into a suitable power of 2 (so
that it is smaller than $D'$ of the above equation, hence
passing to $\varrho_{typ}$ is still possible) one gets that the latter
state is a tensor product of $\log D'$ qubits in maximally
mixed states and approximately $n(\log d - S)$ qubits in
pure states.

Thus starting with $n(\log d - S)$ qubits in a pure state,
one has to add $\log D'$ qubits in the maximally mixed
state, and pass to the state $\varrho_{typ}$, which can be made
arbitrarily close to $\varrho^{\otimes n}$ by choosing small $\epsilon$.
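
To make the counting in the distillation step concrete, the short Python sketch below evaluates the number of pure qubits left over when the noise dimension is chosen as a power of 2 with log D equal to the ceiling of n(S + delta), as in the text. It is only an illustration: the function names `entropy_bits` and `distillable_pure_qubits` are hypothetical, the input is assumed to be the spectrum of a single copy, and the sketch counts qubits without simulating the noisy operation itself.

```python
import math

def entropy_bits(spectrum):
    """Shannon entropy (in bits) of an eigenvalue list."""
    return -sum(p * math.log2(p) for p in spectrum if p > 0)

def distillable_pure_qubits(spectrum, n, delta):
    """Pure qubits kept from n copies: n * log d minus log D, log D = ceil(n (S + delta))."""
    d = len(spectrum)
    s = entropy_bits(spectrum)
    log_noise_dim = math.ceil(n * (s + delta))
    return n * math.log2(d) - log_noise_dim

# Example: a qubit with spectrum (0.9, 0.1); the rate approaches log d - S.
spectrum = [0.9, 0.1]
for n, delta in [(1000, 0.01), (10**6, 0.001)]:
    rate = distillable_pure_qubits(spectrum, n, delta) / n
    print(n, rate, 1 - entropy_bits(spectrum))
```

For small `n` or large `delta` the count can be negative, which simply means that this choice of parameters does not yield any pure qubits.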

B. Optimality of logd — S transition rates and 
optimal mixed-mixed transition rates 

We will now show that the obtained rates are optimal. 
We will follow Ref. 7] invoking standard thermodynam- 
ical reasoning concerning Carnot efSciency (cf. 8]). Es- 
sentially, we will show that I = N — S cannot increase 
under NO maps, and then show that if our transitions are 
not optimal, one could increase $I$ under NO. We will use
the reversibility of our protocol, and also the asymptotic 
continuity property of von Neumann entropy. 

We will prove optimality by contradiction. Suppose
that one could beat the rates obtained above, e.g. that
for the transition to pure qubits $\varrho \to \pi$ one can
obtain a better rate than $R(\varrho \to \pi) = N - S$ (where
$N = \log d$ and $\varrho$ acts on $C^d$). Then one can run the following
transition

$$\pi \to \varrho \to \pi \qquad (20)$$

and obtain a rate for the overall transition which is greater than
1. Similarly, if forming $\varrho$ were cheaper than above, then employing
$n < k(N - S)$ pure qubits one would get $k$ copies of the state $\varrho$.
One could then apply the protocol of the previous section
to these $k$ copies, to obtain $m = k(N - S) > n$ pure qubits.
Thus, in either case, one would be able to increase the number of pure
qubits from $n$ to $m > n$. Repeating the procedure one can
obtain an arbitrary number of pure qubits.

Now, we have to show that this is impossible. This
follows from the fact that $N - S$ cannot increase under
NO maps. Indeed, unitary maps do not change the
quantity. Partial trace of one qubit decreases $N$ by 1,
and can decrease entropy by at most 1. Finally, adding
a system in the maximally mixed state increases $N$ by 1,
but also increases entropy by 1. Now, for $m$ pure qubits,
$N - S = m$, while for $n$ pure qubits we have $N - S = n < m$,
thus the function $N - S$ must increase.
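
The monotonicity argument can be spot-checked numerically. The Python sketch below is an illustration under assumptions, not a proof: it draws one random density matrix (the sampling method and all function names are hypothetical choices), and verifies that adding a maximally mixed qubit leaves N - S unchanged while tracing out one qubit does not increase it.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def information(rho):
    """I = N - S, with N = log2 of the dimension."""
    return float(np.log2(rho.shape[0])) - entropy(rho)

def random_state(dim, rng):
    """A random density matrix (one of many possible sampling choices)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def add_noise_qubit(rho):
    """Append one qubit in the maximally mixed state."""
    return np.kron(rho, np.eye(2) / 2)

def trace_out_last_qubit(rho):
    """Partial trace over the last qubit."""
    d = rho.shape[0] // 2
    return np.einsum('ikjk->ij', rho.reshape(d, 2, d, 2))

rng = np.random.default_rng(0)
rho = random_state(8, rng)  # a random 3-qubit state
print(information(rho), information(add_noise_qubit(rho)),
      information(trace_out_last_qubit(rho)))
```

Unitaries are not included in the check because they leave the spectrum, and hence the entropy, unchanged.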

This is not yet the full proof, as we have made an
implicit assumption that the final qubits are exactly in pure
states. In fact this is not true, as all our conversions are only
asymptotically exact. However the von Neumann entropy
is asymptotically continuous, namely for $N$ qubit states
$\varrho$ and $\sigma$ we have

$$|S(\varrho) - S(\sigma)| \le N \|\varrho - \sigma\| + O(1) \qquad (21)$$

In our case we take $\varrho = \pi^{\otimes m}$, with $\sigma_m$ being the actual
final state. We know then that $S(\varrho) = 0$ and that
$\|\sigma_m - \varrho\|$ tends to zero as $m$ goes to infinity. Thus
$S(\sigma_m)/m \to 0$ for large $m$. Thus the density of the function $I$ tends to
1 for the state $\sigma_m$. This density is also 1 for the initial
state $\pi^{\otimes n}$. Thus we can write that in our process
$I_{out} = m_n - o(m_n)$; on the other hand $I_{in} = n$. We will show
that for large $n$ (which also implies that $m_n$ is large)
$I_{in} < I_{out}$. Indeed, the latter inequality is equivalent to
the following set of equivalent inequalities

$$m_n - o(m_n) > n \qquad (22)$$

$$\frac{m_n}{n} - \frac{o(m_n)}{n} > 1$$

$$\frac{m_n}{n}\left(1 - \frac{o(m_n)}{m_n}\right) > 1$$

The quantity inside the bracket tends to 1, while in our
protocol $\frac{m_n}{n}$ goes to a number greater than one. Thus
the inequality holds, which is impossible. Therefore our
assumption that our rate is not optimal is incorrect. In a
similar way one can show that one cannot obtain a better
rate than

$$R(\pi \to \varrho) = \frac{1}{I} \qquad (23)$$

while going from pure states to mixed ones.

Clearly, since the transitions from mixed to pure states
are reversible and optimal, one can use these protocols
to go from one mixed state to another in a reversible
and optimal way by just distilling pure states and then
creating another mixed state. This gives that the optimal
ratio of conversion between a state $\varrho$ of an $N$ qubit system
and a state $\sigma$ of an $N'$ qubit system is equal to

$$R(\varrho \to \sigma) = \frac{N - S(\varrho)}{N' - S(\sigma)}. \qquad (24)$$
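
As a small worked illustration of this ratio, the Python sketch below computes the conversion rate from the spectra of the two states. The function names are hypothetical, each state is assumed to be given by its full eigenvalue list (so that the number of qubits is the base-2 logarithm of the list length), and a maximally mixed target is treated as free, giving an infinite rate.

```python
import math

def information_from_spectrum(spectrum):
    """I = N - S for a state given by its eigenvalue list (N = log2 of its length)."""
    n_qubits = math.log2(len(spectrum))
    entropy = -sum(p * math.log2(p) for p in spectrum if p > 0)
    return n_qubits - entropy

def conversion_rate(source, target):
    """Ratio of the information contents, i.e. target copies per source copy."""
    i_target = information_from_spectrum(target)
    if i_target == 0:
        return math.inf  # the maximally mixed state is free under NO
    return information_from_spectrum(source) / i_target

# One pure two-qubit state is worth two pure qubits.
print(conversion_rate([1, 0, 0, 0], [1, 0]))
# Reversibility: the rates there and back multiply to one.
a, b = [0.9, 0.1], [0.7, 0.3]
print(conversion_rate(a, b) * conversion_rate(b, a))
```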



V. INFORMATION MONOTONES AND THE 
UNIQUE MEASURE OF INFORMATION 

Here we will derive the unique measure of information
$I$, with virtually no assumptions. The derivation
will be mostly operational. We will actually assume two
properties. The first will concern the intuition of what
information is - namely, noisy operations should not increase
it. Indeed, information, whatever it is, should not
be increased by unitary operations, by adding a qubit in the
maximally mixed state (supposed to be information-less),
or by discarding a qubit (a rather obvious requirement). Thus
we postulate

Postulate 1. $I$ should be monotonic under noisy operations.

We will actually see in the next section, that this pos- 
tulate is rigid, in the sense that if instead of noisy op- 
erations, we had chosen operations with a free resource 
other than maximally mixed states, the theory would be 
trivial, and all rates would be infinite. 

The second assumption will not be connected with the 
expected properties of information. Rather it will display 
the properties any function used in the asymptotic regime 
(limit of many copies) should possess. I.e. 

Postulate 2. $I$ is asymptotically continuous.

By asymptotically continuous, one means the following. Consider
states $\varrho_N$ and $\sigma_N$ of $N$ qubits such that $\|\varrho_N - \sigma_N\| \to 0$
for $N \to \infty$. One would then require

$$\frac{|I(\varrho_N) - I(\sigma_N)|}{N} \to 0. \qquad (25)$$



We then say that $I$ is asymptotically continuous. The
motivation for this is that in the asymptotic regime, one 
identifies the states that asymptotically converge to each 
other. Thus the only relevant functions of states are those 
that also somehow identify those states. Of course in 
the asymptotic limit, the interesting functions become 
infinite, so that one has to pass to intensive quantities 
and divide by the number of copies to obtain densities. 
The relevant functions would be those whose densities 
converge on convergent sequences. Note that this is not
merely a technical requirement. Rather this follows from
the basic assumption of the asymptotic regime - that sim- 
ilar states should be identified. The latter assumption is 
necessary, and physically natural - it is simply impossible 
to obtain exact transitions. 

Let us now prove that there is a unique function that 
satisfies these two postulates. 

The proof can be obtained from Refs. p^. |25| . Ac- 
cording to "25] The following inequality is true 



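$$R(\varrho \to \sigma) \le \frac{f^\infty(\varrho)}{f^\infty(\sigma)} \qquad (26)$$

where $R$ denotes the rate of transition under any given
class of operations, and $f$ is an asymptotically continuous
function nonincreasing under the class. The symbol $\infty$
stands for regularization. The regularization of a function
$f(\varrho)$ is $f^\infty(\varrho) = \lim_{n \to \infty} \frac{1}{n} f(\varrho^{\otimes n})$.

Choosing as $\sigma$ the one qubit pure state $\pi$, and then exchanging
the roles of $\varrho$ and $\sigma$, we obtain

$$R(\varrho \to \pi) \le \frac{f^\infty(\varrho)}{f^\infty(\pi)}, \qquad R(\pi \to \varrho) \le \frac{f^\infty(\pi)}{f^\infty(\varrho)} \qquad (27)$$

Denoting $1/f^\infty(\pi) = a$ we obtain

$$R(\varrho \to \pi) \le a f^\infty(\varrho) \le \frac{1}{R(\pi \to \varrho)} \qquad (28)$$

However we have explicit protocols which show that
$R(\varrho \to \pi) \ge I$ and $1/R(\pi \to \varrho) \le I$. Thus up to the
constant $a$ we obtain that $f^\infty = I$. In this sense $I$ is the
unique measure of information.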

It is interesting to see how other measures of information
are removed in the asymptotic limit. Suppose that
we consider measures of information which satisfy only
the first postulate. Since everything is
very similar to the problem of pure state entanglement,
one is not surprised that all monotones under NO are
so-called Schur concave functions of the density matrix.
In particular there is a set of information measures (or
"monotones") which is enough to determine if a transition
is possible. These are the so-called Ky Fan k-norms,
i.e. sums of the $k$ largest eigenvalues. By definition
of the "more mixed" condition, we have $\varrho \succ \sigma$ iff for
all k-norms $\|\varrho\|_k \le \|\sigma\|_k$. Thus the process $\sigma \to \varrho$ is
possible iff in the process no monotone increases.

One might get the feeling that there is some contradiction
here. Namely, in asymptotic transitions, the only
restriction for the rate is the monotone $I$. Thus there are
allowed transitions for which other monotones increase.
Indeed, we say that $\varrho^{\otimes n} \to \sigma^{\otimes m}$ is possible, though it
is clear that some of the monotones will increase. The
solution is that, in fact, we are not talking about exact
transitions. Thus in the actual transition, the final state
obeys the nonincreasing of monotones. For that state, all
monotones are not greater than for the initial state. The
monotones are however not asymptotically continuous,
and they see differences between that actual state and
the required state $\sigma^{\otimes m}$. The only monotone that does
not see the difference is $I$. Therefore only this function
survives in the asymptotic limit.



A. The choice of free resource is unique 

One could think that the way we have obtained the 
information measure is not fully operational, as we as- 
sumed, somewhat arbitrarily that the free resource is the 
maximally mixed state. Here we will show that this is 
the only reasonable choice, if we want to allow ancillas 
at all, and if the theory is to be nontrivial i.e. the tran- 
sition rates are finite, and therefore, not all states can be 
obtained for free. We thus assume that our operations 
include unitary transformations, and partial trace, and 
will try to play with third component - adding ancillas. 

Suppose that instead of maximally mixed states $\tau$, we
chose any other state $\varrho_0$ as a free resource. This means
that we can use arbitrarily many copies of this state.
From $\varrho_0^{\otimes n}$ we can produce pure states, without use of noise,
by Schumacher compression [18, 19] (in this paper we
have not described this - we always used noise). Thus 
we have pure states for free. From pure states we can 
produce noise by entangling two qubits in a maximally 
entangled state and rejecting one qubit. The remaining 
one will be in a maximally mixed state. This is not very 
efficient: we spend two qubits in a pure state to get one 
qubit of noise. However pure states are for free, hence 
this method is sufficiently good in our situation. Now 
we have both noise and pure states for free, hence via 
the protocol described in the previous section, we can 
create any state. The theory becomes trivial - all states 
are for free; all rates are infinite. Thus if we allow adding 
systems for free at all, we can only add ones in maximally 
mixed state. We thus see that Postulate 1 is rather rigid, 
in the sense that changing it to a class of operations which 
allows any other ancilla, will result in a trivial theory. 
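
The step of producing noise from pure states, by entangling two qubits and rejecting one, can be verified in a few lines. The following Python sketch is an illustration only; the function name `noise_from_two_pure_qubits` is hypothetical, and the maximally entangled state is taken to be the standard Bell state (|00> + |11>)/sqrt(2).

```python
import numpy as np

def noise_from_two_pure_qubits():
    """Reduced state of one qubit of a maximally entangled pair."""
    bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)  # (|00> + |11>) / sqrt(2)
    rho = np.outer(bell, bell.conj())
    # Partial trace over the second qubit.
    return np.einsum('ikjk->ij', rho.reshape(2, 2, 2, 2))

print(noise_from_two_pure_qubits())  # the maximally mixed qubit state
```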

VI. REVERSIBILITY AND IRREVERSIBILITY

Note that we have a kind of reversibility: the number
of pure qubits that can be drawn from a given state is
equal to the number we need to create the state. Let us
consider another situation, where we count everything
(no free resource). We then see that there is a basic
irreversibility: transitions from almost any state $\varrho$ to any
other state are irreversible. For example, one can draw $I$
pure qubits from $\varrho$, but to create $\varrho$, one needs many more
pure qubits. There are two reasons for this. The first reason
is trivial - to get $N$ qubits in state $\varrho$ one needs $N$
qubits anyway. This is 1 qubit per output qubit, which
is already more than $I = N - S(\varrho)$. Now, however, even
more pure qubits are needed. Namely, the output state
has nonzero entropy. However the only way of producing
entropy out of pure states is rather wasteful: one entangles
two qubits, and removes one of them (as already
described in the previous section). Indeed, previously we
had a free source of entropy - maximally mixed states;
now we have only pure states at our disposal, and we
count them.

Interestingly, in the classical world there is no way to 
produce entropy at all. Therefore in classical statistical 
mechanics, one has to assume mixed state from the very 
beginning. Quantum mechanics allows one to produce 
mixed states out of pure ones. This may lead one to 
prefer Bayes concept of probability. 

We will now show that 

Proposition 2 $N + S$ pure qubits are necessary and sufficient
to produce $\varrho$ if one doesn't have access to noise.

That this is sufficient can be seen by noting that $\varrho$
can be created by preparing the purification of
$\varrho_{typ}$. We thus consider a pure state of two systems
A and B. Subsystem A has $N$ qubits, and its state is
$\varrho_{typ}$. The state of subsystem B (the purification)
is also $\varrho_{typ}$, but we do not need it to be an $N$
qubit system; rather, we want it to use the smallest possible
number of qubits. The latter is equal to $S$ qubits. Thus
$N + S$ qubits in a pure state are needed to prepare
$\varrho_{typ}$ (preparation consists of discarding system B).
That this number of qubits is necessary simply stems from the
fact that we start from an initially pure state, so to get a
mixed state we must trace out part of the initial system, and
the "garbage" that gets traced out must have at least $S$
qubits (since the number of qubits of garbage cannot be less
than its entropy, and the garbage must have entropy $S$ since
the system is initially pure). We must also have at least $N$
qubits left over to form the state. So, in general, to create
the $N$ qubit state $\varrho$ we need $N + S$ pure qubits, but
we can draw only $N - S$ qubits. The "information of
preparation" is much greater than the "information of
distillation". During the transition

$$\psi \to \varrho \to \psi' \qquad (29)$$

we lose $2S$ pure qubits.
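
The bookkeeping behind Proposition 2 can be illustrated with a minimal numerical sketch. The function names and the example spectrum (0.9, 0.1) are illustrative assumptions, not part of the original argument; the sketch only evaluates the preparation cost $N + S$, the distillable amount $N - S$, and their difference $2S$.

```python
import math


def entropy_bits(spectrum):
    """Von Neumann entropy (in bits) of a state with the given eigenvalues."""
    return -sum(p * math.log2(p) for p in spectrum if p > 0)


def qubit_budget(n_qubits, entropy):
    """Return (preparation cost, distillation yield, loss) without free noise.

    cost   = N + S  (Proposition 2)
    yield_ = N - S  (the information I)
    loss   = cost - yield_ = 2S
    """
    cost = n_qubits + entropy
    yield_ = n_qubits - entropy
    return cost, yield_, cost - yield_


# Hypothetical example: n copies of a qubit with eigenvalues (0.9, 0.1).
n = 1000
s_total = n * entropy_bits([0.9, 0.1])
cost, gain, loss = qubit_budget(n, s_total)
print(f"cost = {cost:.1f}, yield = {gain:.1f}, loss = {loss:.1f}")
```

In this example the cost exceeds $N$ while the yield falls below it, and the gap between them is twice the total entropy.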



Proposition 3 To produce the mixed-mixed transition
$\varrho \to \sigma$, without access to noise,
$\Delta N + \Delta S$ qubits are necessary and sufficient,
where $\Delta N = N(\sigma) - N(\varrho)$ and
$\Delta S = S(\sigma) - S(\varrho)$.

To see that these resources are necessary, we note that a
general protocol involves an initial state
$\varrho \otimes |\psi\rangle$, where $|\psi\rangle$ is some
initial pure state. One then performs unitaries to give a
state $\varrho'$, and then one traces out the garbage $g$ to
leave the state $\sigma$. We can then use the triangle
inequality

$$|S(\sigma) - S(g)| \le S(\varrho') = S(\varrho) \qquad (30)$$

to see that the number of garbage bits traced out, $N(g)$,
satisfies $N(g) \ge S(g) \ge \Delta S$ (if $S(g) > S(\sigma)$
then trivially $S(g) \ge \Delta S$). So we need a minimum of
$N(\sigma) + \Delta S$ pure qubits to create $\sigma$, but we
already had $N(\varrho)$ bits to start, so the minimum number
of additional qubits needed is $\Delta N + \Delta S$.

The protocol which realizes this bound is to reversibly
distill $\varrho$ into $N(\varrho) - S(\varrho)$ pure qubits
and $S(\varrho)$ bits of noise in a manner which we shall
shortly describe. We then add in an additional
$\Delta N - \Delta S$ pure qubits. However, we also need
$\Delta S$ bits of noise, which costs $2\Delta S$ pure states
(this is the only part of the protocol which is
irreversible). We then create $\sigma$ reversibly as
described in the previous section, using the
$\Delta N + \Delta S$ additional qubits.
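
The accounting of this protocol can be written out as a short sketch. The function name and the example numbers are hypothetical, and the sketch assumes $\Delta S \ge 0$ and $\Delta N \ge \Delta S$ so that both parts of the budget are non-negative; it simply checks that $(\Delta N - \Delta S) + 2\Delta S = \Delta N + \Delta S$.

```python
def mixed_to_mixed_budget(n_rho, s_rho, n_sigma, s_sigma):
    """Pure-qubit bookkeeping for rho -> sigma without free noise.

    Assumption of this sketch: Delta S >= 0 and Delta N >= Delta S.
    Returns (pure qubits added directly, pure qubits spent on noise, total).
    """
    d_n = n_sigma - n_rho
    d_s = s_sigma - s_rho
    if d_s < 0 or d_n < d_s:
        raise ValueError("sketch assumes 0 <= Delta S <= Delta N")
    added_directly = d_n - d_s   # extra information carried by sigma
    spent_on_noise = 2 * d_s     # each bit of noise costs two pure qubits
    return added_directly, spent_on_noise, added_directly + spent_on_noise


# Hypothetical example: N(rho)=10, S(rho)=2, N(sigma)=15, S(sigma)=4.
direct, noise, total = mixed_to_mixed_budget(10, 2, 15, 4)
print(direct, noise, total)  # prints: 3 4 7
```

Here $\Delta N = 5$ and $\Delta S = 2$, so the total of 7 agrees with $\Delta N + \Delta S$.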

The distillation procedure can be realized using a scheme
similar to quantum data compression [18] and to the
concentration of entanglement scheme of Ref. [8] (here,
however, the procedure is applied to the entire state). The
protocol is essentially a projective measurement onto blocks
proportional to the identity. On average, the Hilbert space
that the state is projected onto will be of size $S(\varrho)$
qubits, and so the state can then be unitarily rotated to
leave $N(\varrho) - S(\varrho)$ pure states. We will
explicitly give the protocol for $n$ qubits, i.e.
$N(\varrho) = 1$, but the extension to higher dimensional
states is straightforward.

We can write the state in its eigenbasis, which we label as 0
and 1, i.e. $\varrho = a|0\rangle\langle 0| + b|1\rangle\langle 1|$.
We have $n$ copies, i.e. we operate on the state
$\varrho^{\otimes n}$, and then we measure how many zeros
this state has. This is a measurement with $n + 1$ outcomes
and it will yield a result $k = 0, \ldots, n$ telling us how
many zeros there are. This projects us onto a state which has
$d_k = \binom{n}{k}$ basis vectors, all with equal
coefficients, i.e. it is proportional to the identity. The
probability of finding a particular outcome $k$ is
$p_k = \binom{n}{k} a^k b^{n-k}$, and since the projected
state does not in general span the entire Hilbert space, it
can be unitarily transformed to yield $I_k = n - \log d_k$
pure states.

Each process $\varrho^{\otimes n} \to \{p_k, \varrho_k\}$,
after which $I_k$ pure states are extracted from $\varrho_k$
with probability $p_k$, provides

$$N_0 = \sum_k p_k I_k - H(\{p\}) \qquad (31)$$

total pure states. The Shannon entropy $H(\{p\})$ of the
distribution $\{p_k\}$ equals the cost of the erasure of
information which allows us to work with an ensemble of
$\varrho_k$'s [26]. Thus we need $I_{er} = H(\{p\})$ bits of
erasure to pay for the next part of the scheme, in which we
draw $\sum_k p_k I_k$ pure states. The erasure cost, which is
of order $\log n$, is negligible in the large $n$ limit. We
can divide the above equation by $n$ to obtain the amount of
extractable pure states per qubit,

$$N_0/n = 1 - S(\varrho) \qquad (32)$$

where the erasure cost has been neglected since it is of
order $\log n / n$. This completes our proof of the
proposition.
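
The rate in Eq. (32) can be checked numerically with a small sketch of the counting measurement described above. The function names and the test values $a = 0.9$, $n = 2000$ are illustrative assumptions. The sketch evaluates $p_k$ and $I_k = n - \log d_k$ in log space, and returns both the raw yield $\sum_k p_k I_k / n$ and the net yield after subtracting the erasure cost $H(\{p\})$. In this bookkeeping, since $\log d_k = \log p_k - k\log a - (n-k)\log b$, the net yield per qubit coincides with $1 - S(\varrho)$ up to rounding, while the raw yield exceeds it by $H(\{p\})/n$.

```python
import math


def log2_binom(n, k):
    """log2 of the binomial coefficient C(n, k), via log-gamma."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1)
            - math.lgamma(n - k + 1)) / math.log(2)


def binary_entropy(a):
    """Entropy (bits) of a qubit state with eigenvalues a and 1 - a."""
    return -a * math.log2(a) - (1 - a) * math.log2(1 - a)


def distillation_rates(a, n):
    """Return (raw, net) pure-qubit yield per copy for n copies, 0 < a < 1.

    raw = sum_k p_k I_k / n
    net = (sum_k p_k I_k - H({p})) / n, as in Eq. (31)
    """
    b = 1.0 - a
    drawn = 0.0    # sum_k p_k I_k
    erasure = 0.0  # Shannon entropy H({p}) of the outcome distribution
    for k in range(n + 1):
        log_d = log2_binom(n, k)                       # log2 d_k
        log_p = log_d + k * math.log2(a) + (n - k) * math.log2(b)
        p = 2.0 ** log_p                               # p_k (may underflow to 0)
        if p > 0.0:
            drawn += p * (n - log_d)                   # I_k = n - log2 d_k
            erasure -= p * log_p
    return drawn / n, (drawn - erasure) / n


raw, net = distillation_rates(0.9, 2000)
print(raw, net, 1 - binary_entropy(0.9))
```

The difference between the raw and net rates is the erasure cost per qubit, which shrinks as $n$ grows.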

This allows one to think of states in the following way:
the mixed state consists of $N - S$ bits of information and
$S$ bits of noise. Thus to produce it one needs $N - S$
qubits in pure states, to account for information, and $2S$
qubits to produce noise. Indeed one bit of noise costs two
pure qubits, since noise is produced by discarding part of an
entangled system.

It is interesting that one needs to add a free resource
(noise) in order to achieve efficient transitions from pure
to mixed states, which are much less "useful" than
mixed-to-pure transitions. Indeed, the latter is a task that
can be associated with such actions as cooling, error- 
correction, increasing signal. This useful task can be per- 
formed without the help of an additional resource at the 
optimal rate. Only the converse direction, which is not 
useful (who wants to have mixed states instead of pure 
ones?) needs noise, and is much less efficient without 
noise. 

There are other cases where reversibility needs noise.
For example, according to Shannon's second theorem, one can
simulate one use of a noiseless channel by $1/C$ uses of a
noisy channel of capacity $C$. However, one cannot do the
converse, i.e. simulate noisy channels by a noiseless one,
without sharing random correlated data [27]. Again, the
useful task does not need any additional resource, while the
useless task needs one. This is clear if one realizes that
in both situations we deal with dilution of some valuable
resource into noise. Similarly, in thermodynamics, a
thermodynamical system with a difference of temperature can
be thought of as "pure energy" (such as mechanical energy)
diluted into "pure heat". To draw work out of it one does not
need any additional resource. However, to create the system
of heat baths efficiently, one needs a heat reservoir at the
beginning. Otherwise, one has to spend work to produce heat,
exactly as we needed to spend pure states to produce noise.



VII. DISCUSSION: COMPARISON WITH 
ENTANGLEMENT TRANSFORMATIONS 

The paradigm discussed in this paper may be useful for
understanding the problems of entanglement theory. As is
known, there is a basic irreversibility in entanglement
transformations. We deal there with bipartite systems
shared by distant parties. One is interested in how many
pure singlets are needed to form a state $\varrho_{AB}$ (the
entanglement cost), and also in how many singlets can be
obtained from the state (the distillable entanglement). If
$\varrho_{AB}$ is pure, then the entanglement cost is equal
to the distillable entanglement in the limit of many copies
of $\varrho_{AB}$ [8]. The transformations are reversible.
However, it is known that for a number of mixed states the
distillable entanglement is not equal to the entanglement
cost [28]. One has irreversibility. It has generally been
assumed that this is because one is making transformations
between pure states (in this case, singlets) and the mixed
state $\varrho_{AB}$. One therefore expects some information
loss. However, as we have seen here, one can make
transformations between pure and mixed states completely
reversible, provided one has access to noise. And indeed, in
the paradigm of entanglement theory, there is no reason why
two distant parties couldn't share some initial noisy
resource. There is no special a priori reason for
irreversibility in entanglement theory. It is therefore
interesting to compare the situation discussed here with
that of entanglement theory. This comparison is summarized
in the following table, and described below.



Paradigm                  Class of Operations  Free Resource           Expensive Resource  Reversible?
------------------------  -------------------  ----------------------  ------------------  -------------
information               NO                   maximally mixed states  pure states         yes
pure state entanglement   LOCC                 separable states        singlets            yes
mixed state entanglement  LOCC                 separable states        singlets            no
thermodynamics            adiabatic processes  heat [29]               work                yes
ToE [30]                  LOCC + PPT states    PPT states              singlets            no (?)
PPT [22]                  PPT operations       PPT states              singlets            in some cases



Instead of NO, in entanglement theory we have LOCC, which
means that 1) arbitrary local unitary operations can be
performed, 2) any local ancilla can be added, 3) any local
partial trace can be performed, 4) qubits can be communicated
between distant parties only via a dephasing channel. The
role of noise is played by separable states - all the states
that can be produced for free within the allowed class of
operations are a free resource. The role of pure states is
played by pure entangled states.

One could imagine that, as with "local information theory",
in entanglement theory any state is a reversible mixture of
two phases: pure entanglement and a separable noisy phase.
One should be able to draw the same amount of pure
entanglement from a given state as is needed to produce it.
Creation of mixed states would be a reversible dilution of
pure entanglement into mixed, separable states.

In this simple picture we would have only two kinds 
of basic elements in entanglement theory: pure entan- 
gled systems and disentangled systems. One is useful, 
the other - useless. A state which is neither pure en- 
tangled nor disentangled, consists of those two basic el- 
ements. This is in parallel to the paradigm presented in 
this paper, where the useful elements were pure states, 
the useless maximally mixed ones. 

As noted, such a situation exists for pure states, 
where we can reversibly concentrate and dilute entan- 
glement. However, such a situation does not exist with 
mixed states in entanglement theory. What is the ba- 
sic difference between mixed state entanglement and the 
paradigms (I) of pure entanglement, and (II) the present 
NO one? 

In both I and II we have the following common point.
We define states that can be added for free, and then the
class of operations. Then in both cases it turns out that
the free states remain the only nontrivial set of states
closed under the class of operations. Now in mixed state
entanglement we may have another basic element - bound
entangled states. One cannot obtain them from separable
states, but also one cannot obtain any pure entanglement
from them. Thus the set of states closed under the class of
operations is greater than it would seem from the
construction of the paradigm. In situations I and II we have
only two elements: useful and useless. In paradigm II the
useful element is information, the useless one - noise. In
paradigm I the useful element is entanglement, the useless
one - separability. In mixed state entanglement, by
contrast, entanglement itself is divided into at least two
phases: bound and pure. From bound entanglement we cannot
make pure entanglement, so we call it useless as well. Thus
we can have states that have entanglement, but are useless.
This is different from I and II, but similar to
thermodynamics: we have there two forms of energy, useful
and useless. In Ref. we have asked a question - is it
possible that mixed-state entanglement is like
thermodynamics? There would be three basic elements:
separable states (no entanglement), bound entanglement and
pure entanglement, just as in thermodynamics there are
states without energy, with disordered energy (single heat
bath) and with ordered energy (mechanical energy). All three
kinds could be reversibly mixed.

In [sJI it was shown that such a picture can be treated
as a sort of "first order approximation" rather than a full
description of asymptotic bipartite entanglement. Related
questions were studied in Ref. [s^ where reversibility for
some states holds, if so-called PPT superoperators are
allowed [10]. The relation between the latter result and the
"thermodynamic" approach of Ref. [sj goes beyond the scope
of this paper and is explained in [31] itself.



VIII. CONCLUSION 

Contrary to what might be imagined, we have shown 
that mixed states do not necessarily impose irreversibil- 
ity. One can reversibly transform pure states into mixed 
ones, provided one has access to random noise. This de- 
fines a class of operations (NO) which can then be used 
to explore the transition rates between various states. It 
is found that the information measure $I = N - S$ cannot
increase under NO, and is therefore the unique asymptotically
continuous measure of information. It would be
extremely interesting to explore other restricted classes 
of operations in addition to NO, to see whether there are 
other non-trivial theories. Exploring the connection be- 
tween this, and the LOCC paradigm of entanglement the- 
ory, would be extremely useful in understanding entan- 
glement in distributed quantum systems. Perhaps ideas 
along the lines of [sj may prove fruitful. 